I. Map

A map chart is used to show items on a background that is often geographical.

Map charts in R can combine different types of information in one or more layers. Maps can contain interactive shapes or display markers of various types on an image or map background.

Steps to draw a map in R: 1) find the required map data, such as latitude, longitude, and boundary information, and load it into R; 2) reshape the data to meet the requirements of the function used to draw the map.
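As a self-contained illustration of these two steps, we can use the built-in quakes dataset (1,000 seismic events near Fiji, shipped with base R), so no external map data is needed:

```r
# Step 1: load data that already contains latitude/longitude columns
data(quakes)                         # bundled with base R
str(quakes[, c("lat", "long")])

# Step 2: hand the coordinates to a plotting function --
# here plain base graphics, with longitude as x and latitude as y
plot(quakes$long, quakes$lat,
     pch = 20, cex = 0.5,
     xlab = "Longitude", ylab = "Latitude",
     main = "Fiji earthquakes")
```

The dedicated map packages below follow the same pattern, but supply real boundary polygons instead of a bare scatter of points.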

The maps package comes with maps of the world, Italy, Canada, the United States, France, New Zealand, and other regions, but the available maps are limited to those bundled with the package.

I.1 Static Map
library(maps)
library(ggmap)
## Loading required package: ggplot2
## Google's Terms of Service: https://cloud.google.com/maps-platform/terms/.
## Please cite ggmap if you use it! See citation("ggmap") for details.
library(leaflet)
library(htmlwidgets) # for saving leaflet widgets as standalone HTML
library(RColorBrewer)

The ggmap package can pull map tiles from Google Maps, OpenStreetMap, and Stamen Maps, and can geocode addresses into latitude and longitude, compute distances, and retrieve other related information. To use the Google services, you must first register for an API key.
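The registration itself is a one-time setup step; a minimal sketch, where "YOUR-API-KEY" is a placeholder for the key obtained from the Google Cloud console:

```r
library(ggmap)

# one-time setup; "YOUR-API-KEY" is a placeholder, not a real key
register_google(key = "YOUR-API-KEY")

# confirm that a key has been registered for this session
has_google_key()
```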

map("world", fill = TRUE, col = 'lightBlue',
    ylim = c(-60, 90), mar = c(0, 0, 0, 0))

library(RColorBrewer)
map('state', region = c('california', 'oregon', 'washington'),
    fill = TRUE, col = colorRampPalette(brewer.pal(8,'Pastel1'))(3), mar = c(2, 3, 4, 3))

I.2 Interactive Map

You need to specify the center coordinates and zoom level to draw a certain area:

leaflet() %>%  
addTiles() %>%
setView( lng = -95.09, 
         lat = 29.55,
         zoom = 10) %>%
addProviderTiles("NASAGIBS.ViirsEarthAtNight2012")  ## night-lights tiles provided by NASA

III. Choropleth Map

A choropleth map is a type of thematic map in which areas are shaded or patterned in proportion to a statistical variable that represents an aggregate summary of a geographic characteristic within each area, such as population density or per-capita income.

Choropleth maps provide an easy way to visualize how a measurement varies across a geographic area or show the level of variability within a region. A heat map or isarithmic map is similar but does not use a priori geographic areas. They are the most common type of thematic map because published statistical data (from government or other sources) is generally aggregated into well-known geographic units, such as countries, states, provinces, and counties.
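At its core, a choropleth maps a statistic to a color scale. Before turning to the map packages, the mechanism can be sketched in base R alone: colorRampPalette() interpolates a scale, and findInterval() (one of several options) assigns each value a color in proportion to its magnitude.

```r
# build a 5-step white-to-red scale
pal  <- colorRampPalette(c("white", "red"))
cols <- pal(5)

# illustrative values, e.g. one statistic per region
values <- c(12, 95, 40, 70, 5)

# bin each value into one of the 5 color steps
breaks <- seq(min(values), max(values), length.out = 6)
idx    <- findInterval(values, breaks, all.inside = TRUE)
cols[idx]   # one fill color per region
```

The packages below do exactly this, but attach the colors to real geographic polygons.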

library(RColorBrewer)
library(rgdal)
## Loading required package: sp
## Please note that rgdal will be retired by the end of 2023,
## plan transition to sf/stars/terra functions using GDAL and PROJ
## at your earliest convenience.
## 
## rgdal: version: 1.5-27, (SVN revision 1148)
## Geospatial Data Abstraction Library extensions to R successfully loaded
## Loaded GDAL runtime: GDAL 3.2.3, released 2021/04/27
## Path to GDAL shared files: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/rgdal/gdal
## GDAL binary built with GEOS: TRUE 
## Loaded PROJ runtime: Rel. 7.2.1, January 1st, 2021, [PJ_VERSION: 721]
## Path to PROJ shared files: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/rgdal/proj
## PROJ CDN enabled: FALSE
## Linking to sp version:1.4-5
## To mute warnings of possible GDAL/OSR exportToProj4() degradation,
## use options("rgdal_show_exportToProj4_warnings"="none") before loading sp or rgdal.
## Overwritten PROJ_LIB was /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/library/rgdal/proj
library(ggplot2)
library(ggmap)
library(viridis)
## Loading required package: viridisLite
## 
## Attaching package: 'viridis'
## The following object is masked from 'package:maps':
## 
##     unemp
library(leaflet)
library(htmltools)
library(mapproj)
library(usmap)
library(ggplot2)
library(data.table)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ tibble  3.1.6     ✓ dplyr   1.0.7
## ✓ tidyr   1.1.4     ✓ stringr 1.4.0
## ✓ readr   2.0.1     ✓ forcats 0.5.1
## ✓ purrr   0.3.4
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::between()   masks data.table::between()
## x dplyr::filter()    masks stats::filter()
## x dplyr::first()     masks data.table::first()
## x dplyr::lag()       masks stats::lag()
## x dplyr::last()      masks data.table::last()
## x purrr::map()       masks maps::map()
## x purrr::transpose() masks data.table::transpose()
library(geofacet)
head(statepop)
## # A tibble: 6 × 4
##   fips  abbr  full       pop_2015
##   <chr> <chr> <chr>         <dbl>
## 1 01    AL    Alabama     4858979
## 2 02    AK    Alaska       738432
## 3 04    AZ    Arizona     6828065
## 4 05    AR    Arkansas    2978204
## 5 06    CA    California 39144818
## 6 08    CO    Colorado    5456574
plot_usmap(data = statepop, values = "pop_2015", color = "white") + 
  scale_fill_continuous(name = "Population 2015", label = scales::comma) + 
  theme(legend.position = "right")

plot_usmap(data = countypop, values = "pop_2015", color = "white") + 
  scale_fill_continuous(name = "Population (2015)", label = scales::comma) + 
  theme(legend.position = "right")

We can even use the maps and ggplot2 packages together to build up the plot.

library(tidyverse)
library(ggplot2)
library(maps)
library(usmap)
state = map_data("state")
head(state)
##        long      lat group order  region subregion
## 1 -87.46201 30.38968     1     1 alabama      <NA>
## 2 -87.48493 30.37249     1     2 alabama      <NA>
## 3 -87.52503 30.37249     1     3 alabama      <NA>
## 4 -87.53076 30.33239     1     4 alabama      <NA>
## 5 -87.57087 30.32665     1     5 alabama      <NA>
## 6 -87.58806 30.32665     1     6 alabama      <NA>
state$region=tolower(state$region)
statepop$full=tolower(statepop$full)
state_pop <- merge(as_tibble(state), statepop, by.x = "region", by.y="full")
p <- ggplot() +
  geom_polygon( data=state_pop,
                aes(x=long, y=lat, group=group, 
                    fill = pop_2015/1000000), 
          color="white", size = 0.2) +
  scale_fill_continuous(name="State Population (millions)", 
                        low = "#ffe6cc",
                        high = "#ffa700",
                        limits = c(0,40),
                        breaks=c(5,10,15,20,25,30,35), 
                        na.value = "grey50")
p

IV. Cartogram Heatmap

Unlike a cartogram, a cartogram heatmap does not express values by changing the shape or size of map regions. Instead, each area is represented by a square of the same size, and the value is expressed by color. It is used most widely for U.S. state-level data.

if(!require(statebins)) install.packages("statebins")
## Loading required package: statebins
library(statebins)
library(tidyverse)
library(viridis)
covid <- read.csv("covid_states.csv")
head(covid)
##   submission_date state tot_cases conf_cases prob_cases new_case pnew_case
## 1      02/12/2021    UT    359641     359641          0     1060         0
## 2      03/01/2021    CO    438745     411869      26876      677        60
## 3      08/22/2020    AR     56199         NA         NA      547         0
## 4      07/17/2020    MP        37         37          0        1         0
## 5      08/12/2020    AS         0         NA         NA        0         0
## 6      05/22/2021    MA    704796     659246      45550      451        46
##   tot_death conf_death prob_death new_death pnew_death             created_at
## 1      1785       1729         56        11          2 02/13/2021 02:50:08 PM
## 2      5952       5218        734         1          0 03/01/2021 12:00:00 AM
## 3       674         NA         NA        11          0 08/23/2020 02:15:28 PM
## 4         2          2          0         0          0 07/19/2020 12:00:00 AM
## 5         0         NA         NA         0          0 08/13/2020 02:12:28 PM
## 6     17818      17458        360         5          0 05/23/2021 01:37:59 PM
##   consent_cases consent_deaths
## 1         Agree          Agree
## 2         Agree          Agree
## 3     Not agree      Not agree
## 4         Agree          Agree
## 5                             
## 6         Agree          Agree
#preparation 
covid$state<-as.character(covid$state)  
covid$tot_death<-as.numeric(covid$tot_death)            

#draw the plot
statebins(covid,             #dataframe
          state_col="state",           #the name of state
          value_col = "tot_death",     #the number of death
          round=TRUE,                             
          dark_label = "white",         #dark label as white
          light_label = "black")+       #light label as black
  scale_fill_viridis(option = "E",begin = 0.95,end = 0,name="Deaths")+ #the color of fill
  labs(title = "COVID-19 U.S. Cumulative Deaths by December 12, 2021",    #title and caption
       caption="Source: Centers for Disease Control and Prevention")+
  theme_statebins(legend_position = "right")    

We could separate the total deaths into sub-groups:

##preparation
covid$tot_death[which(covid$tot_death<=100)]<-1      #group the number of tot_death
covid$tot_death[which(covid$tot_death>100&covid$tot_death<=1000)]<-2
covid$tot_death[which(covid$tot_death>1000&covid$tot_death<=5000)]<-3
covid$tot_death[which(covid$tot_death>5000&covid$tot_death<=10000)]<-4
covid$tot_death[which(covid$tot_death>10000&covid$tot_death<=30000)]<-5
covid$tot_death[which(covid$tot_death>30000)]<-6
covid$tot_death<-as.factor(covid$tot_death)  # change them to factor

ggplot(covid,aes(state = state, fill=tot_death))+               
  geom_statebins()+                                           
  scale_fill_brewer(palette = 3,                              
                    direction = -1,                           
                    name="Cumulative Deaths",                            
                    limits=c("6","5","4","3","2","1"), #order the groups
                    labels=c(">30000","10001-30000",
                             "5001-10000","1001-5000",
                             "101-1000","<=100")) + # change the labels
   labs(title = "COVID-19 U.S. Cumulative Deaths by December 12, 2021", #add title
        caption="Source: Centers for Disease Control and Prevention")+ #caption
   theme_statebins(legend_position = "right")
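The six threshold assignments above can also be written as a single cut() call, which avoids mutating tot_death step by step; a base-R sketch using the same breakpoints (the death counts here are illustrative values, not the real data):

```r
# illustrative death counts, one per state
deaths <- c(50, 500, 3000, 8000, 20000, 45000)

# right-closed intervals reproduce the <=100, 101-1000, ... grouping
groups <- cut(deaths,
              breaks = c(-Inf, 100, 1000, 5000, 10000, 30000, Inf),
              labels = c("1", "2", "3", "4", "5", "6"))
groups   # factor with levels 1-6, ready for scale_fill_brewer()
```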

V. Facet Map

A facet map facets by geographic region and draws each panel in a position corresponding to its actual geographic location. The graph makes it easy to compare the distribution of data across regions.

Package: facet maps use the geofacet package to facet by geographic area while keeping the relative position of each panel consistent with the actual geography.

library(ggplot2)
library(geofacet)

Here we use the election dataset from geofacet, which contains the results of the 2016 U.S. presidential election:

head(election)
##     state candidate   votes       pct
## 1 Alabama   Clinton  729547 34.357946
## 2 Alabama     Trump 1318255 62.083092
## 3 Alabama     Other   75570  3.558962
## 4  Alaska   Clinton  116454 36.550871
## 5  Alaska     Trump  163387 51.281512
## 6  Alaska     Other   38767 12.167617
ggplot(election,aes(candidate,pct,fill=candidate))+
  geom_bar(stat = 'identity')+
  scale_y_continuous(limits = c(0,100),breaks = seq(0,100,30),minor_breaks = T)+
  scale_fill_manual(values = c("#4f6f57", "#99ccb3", "#03c03c"))+
  labs(title = '2016 US Election')+
  facet_geo(~state,grid =us_state_grid1 )+         ## facet by the geographic regions
  theme_bw()+
  theme(
    axis.text.x = element_blank()
  )
## Note: You provided a user-specified grid. If this is a generally-useful
##   grid, please consider submitting it to become a part of the geofacet
##   package. You can do this easily by calling:
##   grid_submit(__grid_df_name__)

VI. Connection Map

A connection map shows the connections between several positions on a map. The link between two locations is usually drawn as a great-circle arc: the shortest route between them. We will use the geosphere package.
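For intuition, the great-circle distance that geosphere interpolates along can be computed directly with the haversine formula; a base-R sketch (6371 km is an approximate Earth radius, and the Beijing/Xi'an coordinates are taken from the route data below):

```r
# great-circle distance in km between two lon/lat points
haversine <- function(lon1, lat1, lon2, lat2, r = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
       cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(sqrt(a))
}

# Beijing to Xi'an, the first leg of the land route (roughly 900 km)
haversine(116.397, 39.917, 108.934, 34.231)
```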

library(ggplot2)
library(maps)
library(geosphere)
## 
## Attaching package: 'geosphere'
## The following object is masked from 'package:htmltools':
## 
##     span
airports <- read.csv("airports.csv",header = T) 
flights <- read.csv("flights.csv",header = T,as.is=TRUE)
dataVI <- read.csv("thebeltandroad.csv",header = T) 
dataVI
##            loc       long       lat type
## 1       Beijng 116.397128 39.916527    1
## 2         Xian 108.934250 34.230530    1
## 3      Lanzhou 103.718780 36.103960    1
## 4       Urumqi  88.311040 43.363780    1
## 5   Kazakhstan  71.465669 51.161882    1
## 6   Kyrgyzstan  74.567320 43.103590    1
## 7        Tajik  68.787038 38.568123    1
## 8   Uzbekistan  69.248888 41.314031    1
## 9         Iran  51.384950 35.691309    1
## 10      Turkey  32.856293 39.936684    1
## 11      Russia  37.615010 55.757770    1
## 12     Germany  13.403229 52.523688    1
## 13 Netherlands   4.885376 52.375914    1
## 14       Italy  12.478860 41.953351    1
## 15      Greece  23.701211 37.991796    2
## 16       Kenya  36.800086 -1.286858    2
## 17       India  78.083620 21.175326    2
## 18   Sri Lanka  80.622447  7.654822    2
## 19   Singapore 103.784644  1.287574    2
## 20    Malaysia 102.092093  4.321244    2
## 21 Philippines 121.354063 12.900646    2
## 22     Vietnam 105.697963 21.002657    2
## 23      Beihai 109.119860 21.502835    2
## 24      Haikou 110.334082 19.979834    2
## 25   Guangzhou 113.273050 23.143924    2
## 26    Quanzhou 118.663458 24.897163    2
## 27      Fuzhou 119.298165 26.082742    2

Mark the locations of the land and sea routes on the map:

data1 <- dataVI[dataVI$type == "1",]  # land route
data2 <- dataVI[dataVI$type == "2",]  # sea route
maps::map('world',
     col="grey", fill=TRUE, bg="white", lwd=0.05,  
     mar=rep(0,4), 
     border=0,   
     xlim=c(-20, 150), ylim=c(-20, 70)
)
points(x=data1$long, y=data1$lat, col="tomato", cex=1, pch=20)  #cex: size of point,pch: shape of the point
points(x=data2$long, y=data2$lat, col="slateblue", cex=1, pch=20)

maps::map('world',
     col="darkseagreen1", fill=TRUE, bg="white", lwd=0.05,  
     mar=rep(0,4), 
     border=0,   
     xlim=c(-20, 150), ylim=c(-20, 70)
)
points(x=data1$long, y=data1$lat, col="tomato", cex=1, pch=20)  #cex: size of point,pch: shape of the point
points(x=data2$long, y=data2$lat, col="slateblue", cex=1, pch=20) 
for (j in 1:(length(data1$loc)-1)) {   # the number of points on land -1
   loc1 <- data1[j,]  
   loc2 <- data1[j+1,] 
   inter <- gcIntermediate(  # intermediate points along the great circle
     c(loc1[1,]$long, loc1[1,]$lat), c(loc2[1,]$long, loc2[1,]$lat),  
     n=500, 
     addStartEnd=TRUE) 
     lines(inter, col="tomato", lwd=0.8)  # draw the line
}

for (j in 1:(length(data2$loc)-1)) {   
   loc1 <- data2[j,]  
   loc2 <- data2[j+1,] 
   inter <- gcIntermediate(c(loc1[1,]$long, loc1[1,]$lat), c(loc2[1,]$long, loc2[1,]$lat), n=500, addStartEnd=TRUE)  
    lines(inter, col="slateblue", lwd=0.8)  
}
inter <- gcIntermediate(c(dataVI[14,]$long, dataVI[14,]$lat), c(dataVI[15,]$long, dataVI[15,]$lat), n=500, addStartEnd=TRUE)  
lines(inter, col="purple", lwd=0.8)  # Intersection at Italy and Greece
text(dataVI$loc, x=dataVI$long, y=dataVI$lat,  col="black",cex=0.5, pos=3)

Domestic flights network of Delta Airlines

pal <- colorRampPalette(c("#333333", "white", "#00bfff"))  #gradient: black-grey-white-blue
colors <- pal(100) 
maps::map("world", col="gray17", fill=TRUE, bg="#000000", # dark background
    lwd=0.05,xlim=c(-171.738281, -56.601563), ylim=c(12.039321, 71.856229))  
fsub <- flights[flights$airline == "DL",]  
maxcnt <- max(fsub$cnt) #maximum of the number of flights
fsub <- fsub[order(fsub$cnt),]  
for (j in 1:length(fsub$airline)) {        
    air1 <- airports[airports$iata == fsub[j,]$airport1,]  
    air2 <- airports[airports$iata == fsub[j,]$airport2,] 
    inter <- gcIntermediate(c(air1[1,]$long, air1[1,]$lat), c(air2[1,]$long, air2[1,]$lat), n=100, addStartEnd=TRUE)  
    colindex <- max(1, round((fsub[j,]$cnt / maxcnt) * length(colors)))  # map count to a color index (at least 1)
              lines(inter, col=colors[colindex], lwd=0.8)
}

VII. Google Map via ggmap and APIs

library(ggmap)
library(curl)
library(httr)
library(dplyr)

Instructions for setting up the Google platform are at https://www.littlemissdata.com/blog/maps. First, we use the get_map() function to obtain the map tiles for an address; then we draw the map with ggmap(). The function qmap() combines get_map() and ggmap(), so it can produce the map directly.

UClayer_goo_ter <- get_map(location = "University of Chicago", zoom = 15, source = "google", maptype = "terrain")
UClayer_goo_sat <- get_map(location = "University of Chicago", zoom = 15, source = "google", maptype = "hybrid")
UClayer_goo_roa <- get_map(location = "University of Chicago", zoom = 15, source = "google", maptype = "roadmap")
UClayer_sta_ter <- get_map(location = "University of Chicago", zoom = 15, source = "stamen", maptype = "terrain")
UClayer_sta_wat <- get_map(location = "University of Chicago", zoom = 15, source = "stamen", maptype = "watercolor")
UClayer_sta_ton <- get_map(location = "University of Chicago", zoom = 15, source = "stamen", maptype = "toner")
Chicagolayer <- get_googlemap(center = c(-84.50, 39.137580),
                         zoom = 12, scale = 2,
                         maptype ='terrain',
                         color = 'color')
Houstonlayer <- get_map(location = "Houston, TX", zoom = 11, source = "google", maptype = "roadmap")
USA_lon_lat <- c(left = -128, bottom = 23, right = -65, top =52)
USAlayer <- get_stamenmap(USA_lon_lat, zoom = 5)
q1=qmap('Chicago', zoom=12, maptype='roadmap')
library(grid)
library(gridExtra)
g1=ggmap(UClayer_goo_ter,extent = 'device') + ggtitle("Google Maps Terrain") 
g2=ggmap(UClayer_goo_sat,extent = 'device') + ggtitle("Google Maps Satellite")
g3=ggmap(UClayer_goo_roa,extent = 'device') + ggtitle("Google Maps Road")
g4=ggmap(UClayer_sta_ter,extent = 'device') + ggtitle("Stamen Maps Terrain")
g5=ggmap(UClayer_sta_wat,extent = 'device') + ggtitle("Stamen Maps Watercolor")
g6=ggmap(UClayer_sta_ton,extent = 'device') + ggtitle("Stamen Maps Toner")
grid.arrange(g1,g2,g3,g4,g5,g6,nrow = 2)

violent_crimes <- subset(crime, offense != "auto theft" & offense != "theft" & offense != "burglary")
violent_crimes$offense <- factor(violent_crimes$offense, 
                                 levels = c("robbery", 
                                            "aggravated assault", 
                                            "rape", 
                                            "murder"))
head(violent_crimes)
##                      time     date hour premise            offense  beat
## 82729 2010-01-01 01:00:00 1/1/2010    0     18A             murder 15E30
## 82730 2010-01-01 01:00:00 1/1/2010    0     13R            robbery 13D10
## 82731 2010-01-01 01:00:00 1/1/2010    0     20R aggravated assault 16E20
## 82732 2010-01-01 01:00:00 1/1/2010    0     20R aggravated assault  2A30
## 82733 2010-01-01 01:00:00 1/1/2010    0     20A aggravated assault 14D20
## 82757 2010-01-01 02:00:00 1/1/2010    1     13R            robbery  2A50
##           block    street type suffix number   month    day
## 82729 9600-9699   marlive   ln      -      1 january friday
## 82730 4700-4799 telephone   rd      -      1 january friday
## 82731 5000-5099  wickview   ln      -      1 january friday
## 82732 1000-1099   ashland   st      -      1 january friday
## 82733 8300-8399    canyon           -      1 january friday
## 82757 5200-5299    center   st      -      1 january friday
##                       location           address       lon      lat
## 82729    apartment parking lot   9650 marlive ln -95.43739 29.67790
## 82730 road / street / sidewalk 4750 telephone rd -95.29888 29.69171
## 82731        residence / house  5050 wickview ln -95.45586 29.59922
## 82732        residence / house   1050 ashland st -95.40334 29.79024
## 82733                apartment       8350 canyon -95.37791 29.67063
## 82757 road / street / sidewalk    5250 center st -95.41530 29.77119
HoustonCr <- ggmap(Houstonlayer)
HoustonCr + 
  geom_point(aes(x = lon, y = lat, colour = offense),
             data = violent_crimes,
             size=1,
             alpha=0.2)

HoustonCr + 
  stat_density2d(aes(x = lon, y = lat, fill = ..level.., alpha = ..level..),
                 size = 2, bins = 4, data = violent_crimes,
                 geom = "polygon")

HoustonCr + 
  stat_density2d(aes(x = lon, y = lat, fill = ..level..),
                 bins = 5, geom = "polygon",data = violent_crimes,alpha=0.4) +
  facet_wrap(~ offense)